Mathematics has always been about understanding structures and patterns.
Ancient Greek mathematicians studied geometry and number theory, while modern mathematicians explore abstract algebra, topology, and category theory.
However, for the longest time, there was no formal framework to study these structures and patterns.
In other words, we did not have a formal language to describe what a "structure" or "pattern" is.
This changed in the late 19th and early 20th centuries with the development of set theory by Georg Cantor and others.
Set theory provides a formal framework to study collections of objects, which we call sets.
As we are about to study heavily abstract topics in theoretical physics --- think fiber bundles, Lie groups, measure theory, and functional analysis --- it is essential to have a solid understanding of set theory.
Also, the section and theorem numbering follows Hassani's book.
You can consider this set of notes to be my annotated version, or a different perspective, of the book.
A set is a collection of distinct objects, considered as an object in its own right.
A singleton is a set with exactly one element, and an empty set is a set with no elements, denoted by $\emptyset$.
It can be defined as $\emptyset = \{x \mid x \neq x\}$, a description that no object satisfies.
If $A$ and $B$ are sets, then $A$ is a subset of $B$, denoted by $A \subseteq B$, if every element of $A$ is also an element of $B$.
Two sets $A$ and $B$ are equal, denoted by $A = B$, if they have the same elements, i.e., $A \subseteq B$ and $B \subseteq A$.
Set operations include the union $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$, the intersection $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$, the difference $A \setminus B = \{x \mid x \in A \text{ and } x \notin B\}$, and the complement $A^c$ of $A$ (the elements not in $A$, taken relative to some larger set).
In a specific theory, we often have a universal set that contains all the objects under consideration.
The power set of a set $A$, denoted by $\mathcal{P}(A)$, is the set of all subsets of $A$.
For example, if $A = \{1, 2\}$, then $\mathcal{P}(A) = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}$.
By Cantor's theorem, the power set of a set has a strictly greater cardinality than the set itself.
For example, the set of natural numbers $\mathbb{N}$ is countably infinite, while its power set $\mathcal{P}(\mathbb{N})$ is uncountably infinite.
The Cartesian product of two sets $A$ and $B$, denoted by $A \times B$, is the set of all ordered pairs $(a, b)$ where $a \in A$ and $b \in B$:
$$A \times B = \{(a, b) \mid a \in A,\ b \in B\}.$$
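As a quick sanity check (a minimal Python sketch of my own; the small sets $\{1, 2\}$ and $\{x, y, z\}$ are chosen arbitrarily), we can enumerate a power set and a Cartesian product and confirm that $|\mathcal{P}(A)| = 2^{|A|}$ and $|A \times B| = |A|\,|B|$.

```python
from itertools import chain, combinations, product

def power_set(s):
    """Return the set of all subsets of s (as frozensets)."""
    elems = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1))}

A = {1, 2}
B = {"x", "y", "z"}

print(power_set(A))                                 # 4 subsets: {}, {1}, {2}, {1, 2}
print(len(power_set(A)) == 2 ** len(A))             # True: |P(A)| = 2^|A|
print(set(product(A, B)))                           # all ordered pairs (a, b)
print(len(set(product(A, B))) == len(A) * len(B))   # True: |A x B| = |A||B|
```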
A relation $\sim$ on a set $A$ is a subset of $A \times A$; we write $a \sim b$ when the pair $(a, b)$ belongs to it. An equivalence relation is a relation that is reflexive, symmetric, and transitive, and the equivalence class of $a$ is $[a] = \{x \in A \mid x \sim a\}$. If $\sim$ is an equivalence relation on a set $A$, then for any $a, b \in A$, either $[a] \cap [b] = \emptyset$ or $[a] = [b]$.
Proof. Let $c$ be an element in $[a]$. Obviously, either $c \in [b]$ or $c \notin [b]$.
Let's consider the two cases.
First, if $c \in [b]$, then by definition of equivalence class, $c \sim b$.
Since $c \in [a]$, we also have $c \sim a$, and by symmetry $a \sim c$.
By transitivity, we have $a \sim b$.
Then, for any $x \in [a]$, we have $x \sim a$. By transitivity, $x \sim b$, so $x \in [b]$.
Thus $[a] \subseteq [b]$.
By symmetry of the argument, we also have $[b] \subseteq [a]$.
Therefore, $[a] = [b]$.
Second, we have $c \notin [b]$. For any $x \in [a]$, we have $x \sim a$.
By transitivity, if some $x \in [a]$ also lay in $[b]$, then $x \sim b$; combining this with $c \sim a$ and $a \sim x$ (symmetry of $x \sim a$), we would have $c \sim b$, i.e., $c \in [b]$, which contradicts the assumption that $c \notin [b]$.
Therefore, $x \notin [b]$ for all $x \in [a]$, and thus $[a] \cap [b] = \emptyset$.
The equality of real numbers is an equivalence relation.
Comparison of real numbers (i.e., $<$, $>$, $\leq$, $\geq$) is not an equivalence relation.
Congruence modulo $n$ on the set of integers $\mathbb{Z}$ is an equivalence relation.
Similarity of triangles is an equivalence relation.
In electromagnetism, two vector potentials $\mathbf{A}$ and $\mathbf{A}'$ are related by a gauge transformation if there exists a scalar function $\lambda$ such that $\mathbf{A}' = \mathbf{A} + \nabla \lambda$. This relation is an equivalence relation.
A partition of a set $A$ is a collection $\{B_\alpha\}$ of non-empty subsets of $A$ such that
$\bigcup_\alpha B_\alpha = A$,
$B_\alpha \cap B_\beta = \emptyset$ for all $\alpha \neq \beta$.
In other words, a partition of a set $A$ is a way of dividing $A$ into a collection of smaller sets that do not overlap and together cover the entire set $A$.
For example, the set of integers can be partitioned into even and odd integers.
In basic calculus, we partition the real line into intervals to define the Riemann integral.
There is a deep connection between equivalence relations and partitions.
Let $\sim$ be an equivalence relation on a set $A$.
Then, the set of equivalence classes $\{[a] \mid a \in A\}$ forms a partition of $A$.
Conversely, given a partition $\{B_\alpha\}$ of a set $A$, we can define an equivalence relation on $A$ by saying that $a \sim b$ if and only if $a$ and $b$ belong to the same subset of the partition.
To put this more concretely, consider an example. Suppose we have $A = \mathbb{Z}$ and the equivalence relation of congruence modulo $3$.
The equivalence classes are $[0] = \{\ldots, -3, 0, 3, 6, \ldots\}$, $[1] = \{\ldots, -2, 1, 4, 7, \ldots\}$, and $[2] = \{\ldots, -1, 2, 5, 8, \ldots\}$.
These equivalence classes form a partition of $\mathbb{Z}$, as they are non-empty, disjoint, and their union is $\mathbb{Z}$.
The collection $\{[a] \mid a \in A\}$ is called the quotient set of $A$ by the equivalence relation $\sim$, denoted by $A/\sim$.
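To see the partition and the quotient set concretely, here is a minimal Python sketch (restricted to a finite window of $\mathbb{Z}$ purely for illustration) that groups integers by their residue modulo 3 and checks the partition properties.

```python
# Group a finite window of the integers by residue modulo 3.
Z_window = range(-9, 10)
classes = {r: {n for n in Z_window if n % 3 == r} for r in range(3)}

print(classes[0])  # multiples of 3 inside the window
print(classes[1])  # integers congruent to 1 mod 3
print(classes[2])  # integers congruent to 2 mod 3

# Partition checks: non-empty, pairwise disjoint, union covers the window.
assert all(c for c in classes.values())
assert all(classes[i].isdisjoint(classes[j])
           for i in range(3) for j in range(3) if i != j)
assert set().union(*classes.values()) == set(Z_window)

# The quotient set Z/~ is the collection of the three classes.
quotient = list(classes.values())
print(len(quotient))  # 3
```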
Now that we have a solid understanding of sets and relations, how do we relate different sets to each other?
One way is through the concept of a map.
A map $f$ from a set $X$ to a set $Y$, denoted by $f: X \to Y$, is a rule that assigns to each element $x \in X$ an element $y \in Y$, denoted by $f(x)$.
Each element $x \in X$ corresponds to exactly one element $f(x) \in Y$, but an element $y \in Y$ can correspond to multiple elements in $X$.
Three important terms are:
The set $X$ is called the domain of the map $f$.
The set $Y$ is called the codomain of the map $f$.
If $f(x) = y$, then $y$ is called the image of $x$ under $f$, and $x$ is called a preimage of $y$.
We denote the image by $f(x)$, and the preimage by $f^{-1}(y) = \{x \in X \mid f(x) = y\}$.
If a map's codomain is a set of numbers (e.g., $\mathbb{Z}$, $\mathbb{R}$, $\mathbb{C}$), then the map is usually called a function.
The identity map on a set $X$, denoted by $\mathrm{id}_X$, is the map that assigns each element to itself, i.e., $\mathrm{id}_X(x) = x$ for all $x \in X$.
The graph of $f$, written as $\Gamma_f$, is the set of ordered pairs $\{(x, f(x)) \mid x \in X\} \subseteq X \times Y$.
In elementary algebra, we often visualize the graph of a function by plotting the points $(x, f(x))$ in the Cartesian plane.
If we have two maps $f: X \to Y$ and $g: Y \to Z$, we can compose them to get a new map $g \circ f: X \to Z$, defined by $(g \circ f)(x) = g(f(x))$ for all $x \in X$.
Function composition is associative, i.e., if we have three maps $f: X \to Y$, $g: Y \to Z$, and $h: Z \to W$, then
$$h \circ (g \circ f) = (h \circ g) \circ f.$$
The identity map acts as the identity element for function composition, i.e., for any map $f: X \to Y$, we have
$$f \circ \mathrm{id}_X = f = \mathrm{id}_Y \circ f.$$
(These properties make the collection of all sets and maps between them into a category, which we will explore in more detail later.)
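These composition laws are easy to spot-check numerically; here is a minimal Python sketch (the maps $f$, $g$, $h$ are arbitrary choices of mine, not from the text) verifying associativity and the identity law on a few sample points.

```python
# Spot-check associativity of composition and the identity law.
def compose(g, f):
    """Return the map g ∘ f, i.e., x ↦ g(f(x))."""
    return lambda x: g(f(x))

f = lambda x: x + 1          # f: R -> R
g = lambda x: 2 * x          # g: R -> R
h = lambda x: x ** 2         # h: R -> R
identity = lambda x: x       # id

for x in [-3, 0, 1, 2.5]:
    # h ∘ (g ∘ f) == (h ∘ g) ∘ f
    assert compose(h, compose(g, f))(x) == compose(compose(h, g), f)(x)
    # f ∘ id == f == id ∘ f
    assert compose(f, identity)(x) == f(x) == compose(identity, f)(x)
print("associativity and identity laws hold on the sample points")
```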
A map $f: X \to Y$ is called injective (or one-to-one) if for all $x_1, x_2 \in X$, $f(x_1) = f(x_2)$ implies $x_1 = x_2$.
In other words, different elements in the domain map to different elements in the codomain.
When visualized in the Cartesian plane, an injective function passes the horizontal line test.
A map $f: X \to Y$ is called surjective (or onto) if for every $y \in Y$, there exists at least one $x \in X$ such that $f(x) = y$.
In other words, every element in the codomain has at least one preimage in the domain.
The sine function is not surjective when considered as a map from $\mathbb{R}$ to $\mathbb{R}$, but it is surjective when considered as a map from $\mathbb{R}$ to the interval $[-1, 1]$.
A map is called bijective if it is both injective and surjective.
A bijective map establishes a one-to-one correspondence between the elements of the domain and codomain.
A bijective map $f: X \to Y$ has an inverse map $f^{-1}: Y \to X$, defined by $f^{-1}(y) = x$ if and only if $f(x) = y$.
The inverse map satisfies the properties $f^{-1} \circ f = \mathrm{id}_X$ and $f \circ f^{-1} = \mathrm{id}_Y$.
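For maps between finite sets, injectivity, surjectivity, and the inverse of a bijection can be checked by brute force. Here is a minimal Python sketch along those lines; the sets and the map are made up purely for illustration.

```python
def is_injective(f, domain):
    """f is injective iff no two domain elements share an image."""
    images = [f(x) for x in domain]
    return len(images) == len(set(images))

def is_surjective(f, domain, codomain):
    """f is surjective iff every codomain element is hit."""
    return {f(x) for x in domain} == set(codomain)

def inverse(f, domain):
    """Inverse of a bijection, tabulated on a finite domain."""
    return {f(x): x for x in domain}

X = {0, 1, 2, 3}
Y = {0, 1, 4, 9}
square = lambda x: x ** 2

print(is_injective(square, X))              # True on {0, 1, 2, 3}
print(is_surjective(square, X, Y))          # True: every y in Y is hit
inv = inverse(square, X)
print(all(inv[square(x)] == x for x in X))  # f^{-1} ∘ f = id_X
print(all(square(inv[y]) == y for y in Y))  # f ∘ f^{-1} = id_Y
```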
If we consider a map $f: X \to Y$, we can define an equivalence relation $\sim$ on $X$ by saying that $x_1 \sim x_2$ if and only if $f(x_1) \approx f(x_2)$, where $\approx$ is another equivalence relation on $Y$ (usually the equality relation).
For example, consider the map $f: \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^2$, with $\approx$ taken to be equality.
Then, we have $2 \sim -2$ because $f(2) = f(-2) = 4$.
This is obviously not a bijective map, as it is not injective.
Generally, for any $x \in \mathbb{R}$, the equivalence class is $[x] = \{x, -x\}$, so a representative of $[x]$ can be any element of $\{x, -x\}$.
To make $f$ injective, we can restrict the domain to a subset of $\mathbb{R}$ that contains exactly one representative from each equivalence class.
For example, if we restrict the domain of $f$ to $[0, \infty)$, then $f$ becomes injective, and it is a bijection onto its image $[0, \infty)$.
In fact, for any map $f: X \to Y$, we can define another map $\tilde{f}: X/\sim\ \to Y$ by $\tilde{f}([x]) = f(x)$, known as the quotient map.
This map is always injective, as different equivalence classes map to different elements in $Y$.
However, it may not be surjective, as some elements in $Y$ may not have a preimage in $X/\sim$.
If we restrict the codomain to the image $f(X)$, the quotient map $\tilde{f}: X/\sim\ \to f(X)$ becomes bijective.
Proof. This is quite obvious from the definition of the quotient map.
We have already established that $\tilde{f}$ is injective.
To show that it is surjective, let $y \in f(X)$.
By definition of the image, there exists $x \in X$ such that $f(x) = y$.
Then, we have $\tilde{f}([x]) = f(x) = y$.
Thus, every element in $f(X)$ has a preimage in $X/\sim$.
Therefore, $\tilde{f}$ is surjective.
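To see this claim concretely, here is a small Python sketch (my own illustration, not from the book) that builds the equivalence classes induced by $f(x) = x^2$ on a finite symmetric domain and checks that the induced map $\tilde{f}$ is a bijection onto the image $f(X)$.

```python
# Quotient of X by x1 ~ x2 iff f(x1) == f(x2), with f(x) = x^2.
X = list(range(-4, 5))
f = lambda x: x * x

# Build the equivalence classes [x] = {x' in X : f(x') == f(x)}.
classes = {}
for x in X:
    classes.setdefault(f(x), set()).add(x)
quotient = [frozenset(c) for c in classes.values()]   # the quotient set X/~

# The induced map f~ : X/~ -> Y sends [x] to f(x); it is well defined
# because all members of a class share the same image under f.
f_tilde = {c: f(next(iter(c))) for c in quotient}

image = {f(x) for x in X}                             # f(X)
assert len(set(f_tilde.values())) == len(quotient)    # injective
assert set(f_tilde.values()) == image                 # surjective onto f(X)
print(quotient)   # classes like {-3, 3}, {0}, ...
print(f_tilde)
```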
There are some very important examples of maps and quotient maps:
Consider the sine and cosine functions, $\sin: \mathbb{R} \to \mathbb{R}$ and $\cos: \mathbb{R} \to \mathbb{R}$.
These functions are not bijective when considered as maps from $\mathbb{R}$ to $\mathbb{R}$, but they become bijective as maps onto $[-1, 1]$ when the domain is restricted to $[-\pi/2, \pi/2]$ for sine and to $[0, \pi]$ for cosine.
The inverse functions, $\sin^{-1}: [-1, 1] \to [-\pi/2, \pi/2]$ and $\cos^{-1}: [-1, 1] \to [0, \pi]$, are called the arcsine and arccosine functions, respectively.
For any set $X$ with equivalence relation $\sim$, the canonical projection map $\pi: X \to X/\sim$ is defined by $\pi(x) = [x]$.
This is a very important map, as it allows us to "collapse" the set $X$ into its equivalence classes.
The canonical projection map is always surjective, as every equivalence class has at least one representative in $X$.
However, it may not be injective, as different elements in $X$ may belong to the same equivalence class.
The corresponding quotient map $\tilde{\pi}: X/\sim\ \to X/\sim$ is the identity map, which is obviously bijective.
Also, domains of maps can be Cartesian products of sets.
The dot product, for example, is given by a map $\cdot\,: V \times V \to \mathbb{R}$, where $V$ is a real vector space.
A few important examples of maps with Cartesian product domains include:
A binary operation on a set $S$ is a map $S \times S \to S$.
Examples include addition and multiplication on $\mathbb{Z}$, $\mathbb{R}$, and $\mathbb{C}$.
Binary operations with certain properties give rise to algebraic structures such as groups, rings, and fields.
A bilinear map between two vector spaces $V$ and $W$ over a field $F$ is a map $B: V \times W \to U$ (where $U$ is another vector space over $F$, often $F$ itself) that is linear in each argument.
Examples include the dot product and the cross product in $\mathbb{R}^3$.
These form the basis of multilinear algebra and tensor analysis.
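As a small illustration (my own sketch, not from the text), the dot and cross products on $\mathbb{R}^3$ can be written as maps with Cartesian product domains, and bilinearity can be spot-checked numerically.

```python
def dot(u, v):
    """Dot product: a map R^3 x R^3 -> R."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product: a map R^3 x R^3 -> R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v, w = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)
a, b = 2.0, -3.0

# Bilinearity spot check: linearity of the dot product in its first argument.
lhs = dot(tuple(a * ui + b * wi for ui, wi in zip(u, w)), v)
rhs = a * dot(u, v) + b * dot(w, v)
print(abs(lhs - rhs) < 1e-12)   # True
print(dot(u, v), cross(u, v))
```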
Another important structure on a set is a metric.
Sets themselves do not have any notion of distance or closeness between elements.
To make things physically relevant, or even useful at all, we need to endow sets with additional structure.
A metric on a set $X$ is a function $d: X \times X \to \mathbb{R}$ that satisfies the following properties for all $x, y, z \in X$:
Non-negativity: $d(x, y) \geq 0$, and $d(x, y) = 0$ iff $x = y$.
Symmetry: $d(x, y) = d(y, x)$.
Triangle inequality: $d(x, z) \leq d(x, y) + d(y, z)$.
Sometimes the second half of the non-negativity property is relaxed: $d(x, y) = 0$ no longer implies $x = y$, in which case we call $d$ a pseudometric.
Our universe is a pseudometric space, as the distance between two points in spacetime can be zero even if they are not the same point (i.e., for lightlike separated points).
Anyways, a set $X$ equipped with a metric $d$ is called a metric space, denoted by $(X, d)$.
For $X = \mathbb{Q}$ or $\mathbb{R}$, the absolute value metric is defined by $d(x, y) = |x - y|$.
For $X$ the surface of a 2-sphere in $\mathbb{R}^3$, the great-circle distance metric is defined by $d(x, y) = R\,\Delta\sigma$, where $R$ is the radius of the sphere and $\Delta\sigma$ is the angle between the vectors from the center of the sphere to the points $x$ and $y$.
For $X = C[a, b]$ (the set of all continuous functions on the interval $[a, b]$), the metric
$$d(f, g) = \int_a^b |f(x) - g(x)|\, dx$$
defines a metric space.
For $X$ a bounded continuous function space, we can define
$$d(f, g) = \sup_{x} |f(x) - g(x)|,$$
which is called the uniform metric.
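These function-space metrics can be approximated numerically. The Python sketch below is only an illustration: it approximates the integral metric with a midpoint Riemann sum and the uniform metric with a grid maximum, for two arbitrarily chosen functions on $[0, 1]$.

```python
import math

def integral_metric(f, g, a, b, n=10_000):
    """Approximate d(f, g) = ∫_a^b |f(x) - g(x)| dx with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(abs(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h)) for i in range(n)) * h

def uniform_metric(f, g, a, b, n=10_000):
    """Approximate d(f, g) = sup_x |f(x) - g(x)| on a uniform grid."""
    return max(abs(f(a + i * (b - a) / n) - g(a + i * (b - a) / n)) for i in range(n + 1))

f = lambda x: math.sin(math.pi * x)
g = lambda x: x

print(integral_metric(f, g, 0.0, 1.0))   # approximate L1 distance between f and g
print(uniform_metric(f, g, 0.0, 1.0))    # ≈ 1, the largest deviation (near x = 1)
```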
In $\mathbb{R}^n$, we can define the Euclidean metric by
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},$$
where $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$.
In $\mathbb{R}^n$, we can also define the Manhattan metric by
$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|,$$
where $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$.
This metric is named after the grid-like street layout of Manhattan, where the distance between two points is the sum of the absolute differences of their coordinates.
It is used in various applications, including urban planning, robotics, and machine learning.
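Here is a minimal Python sketch (points chosen at random, purely for illustration) comparing the Euclidean and Manhattan metrics and spot-checking the three metric axioms on those points.

```python
import random

def euclidean(x, y):
    """Euclidean metric on R^n."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) ** 0.5

def manhattan(x, y):
    """Manhattan metric on R^n."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

random.seed(0)
pts = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(20)]

for d in (euclidean, manhattan):
    for x in pts:
        for y in pts:
            assert d(x, y) >= 0 and d(x, y) == d(y, x)        # non-negativity, symmetry
            for z in pts:
                assert d(x, z) <= d(x, y) + d(y, z) + 1e-12   # triangle inequality
print(euclidean((0, 0, 0), (1, 1, 1)), manhattan((0, 0, 0), (1, 1, 1)))  # sqrt(3) vs 3
```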
A sequence in a set $X$ is a function $s: \mathbb{N} \to X$, where $\mathbb{N}$ is the set of natural numbers.
Typically, the value $s(n)$ is denoted by $x_n$, and the sequence as a whole by $\{x_n\}_{n=1}^{\infty}$, or simply $\{x_n\}$.
A sequence $\{x_n\}$ in a metric space $(X, d)$ is said to converge to a point $x \in X$ if for every $\varepsilon > 0$, there exists a natural number $N$ such that for all $n > N$, $d(x_n, x) < \varepsilon$.
The sequence converges to $x$, denoted by $\lim_{n \to \infty} x_n = x$ or $x_n \to x$ as $n \to \infty$.
In other words, a sequence converges to a point if the terms of the sequence get arbitrarily close to that point as $n$ increases.
Unfortunately, we can't readily tell if a sequence converges just by looking at it.
A weaker notion is that of a Cauchy sequence.
A sequence $\{x_n\}$ in a metric space $(X, d)$ is called a Cauchy sequence if for every $\varepsilon > 0$, there exists a natural number $N$ such that for all $m, n > N$, $d(x_m, x_n) < \varepsilon$.
In other words, a Cauchy sequence is a sequence where the terms get arbitrarily close to each other as $n$ increases.
Every convergent sequence is a Cauchy sequence, but the converse is not necessarily true.
Proof. Let $\{x_n\}$ be a convergent sequence in a metric space $(X, d)$ that converges to a point $x$.
This obviously means that for every $\varepsilon > 0$, there exists a natural number $N$ such that for all $n > N$, $d(x_n, x) < \varepsilon/2$.
We add a factor of $\tfrac{1}{2}$ for convenience; scaling $\varepsilon$ by a constant does not change the definition.
Now, recall the triangle inequality property of a metric and apply it to $d(x_m, x_n)$:
$$d(x_m, x_n) \leq d(x_m, x) + d(x, x_n).$$
For all $m, n > N$, we have $d(x_m, x) < \varepsilon/2$ and $d(x, x_n) < \varepsilon/2$, as the sequence is convergent.
Thus, we have
$$d(x_m, x_n) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
This means that for every $\varepsilon > 0$, there exists a natural number $N$ such that for all $m, n > N$, $d(x_m, x_n) < \varepsilon$.
Thus, all convergent sequences are Cauchy sequences.
Next, we show that not all Cauchy sequences are convergent.
A simple way to prove this is by counterexample.
Consider the metric space $(\mathbb{Q}, d)$, where $d$ is the absolute value metric.
Consider the sequence defined by successive decimal approximations of $\sqrt{2}$:
$$x_1 = 1,\quad x_2 = 1.4,\quad x_3 = 1.41,\quad x_4 = 1.414,\quad \ldots$$
This sequence is a Cauchy sequence: for every $\varepsilon > 0$, we can find a natural number $N$ such that for all $m, n > N$, $|x_m - x_n| < \varepsilon$ (terms beyond $x_N$ agree in at least their first $N$ decimal places, so $|x_m - x_n| < 10^{-N}$).
However, this sequence does not converge in $\mathbb{Q}$, as $\sqrt{2}$ is not a rational number.
This shows that not all Cauchy sequences are convergent.
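The counterexample can be played with in exact rational arithmetic. The Python sketch below (my own illustration; it builds truncated decimal approximations of $\sqrt{2}$ using Python's `Fraction`) shows the terms clustering ever closer together while remaining rational, with their squares never reaching 2.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_approx(n):
    """n-digit truncated decimal approximation of sqrt(2) as an exact rational."""
    scale = 10 ** n
    return Fraction(isqrt(2 * scale * scale), scale)   # floor(sqrt(2) * 10^n) / 10^n

terms = [sqrt2_approx(n) for n in range(8)]
print([float(t) for t in terms])           # 1.0, 1.4, 1.41, 1.414, ...

# Cauchy behaviour: consecutive terms differ by less than 10^{-n}.
print([float(abs(terms[n + 1] - terms[n])) for n in range(7)])

# But there is no rational limit: every term is rational and its square
# stays strictly below 2, while no rational squares to exactly 2.
print(all(t * t < 2 for t in terms))       # True
```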
If we want all Cauchy sequences to be convergent, we need to impose an additional condition on the metric space.
A metric space $(X, d)$ is called complete if every Cauchy sequence in $X$ converges to a point in $X$.
Any incomplete metric space can be "completed" by adding the limit points of all Cauchy sequences.
For example, the set of rational numbers $\mathbb{Q}$ is not complete, as the sequence above does not converge in $\mathbb{Q}$.
However, we can complete $\mathbb{Q}$ by adding the irrational numbers, resulting in the set of real numbers $\mathbb{R}$, which is complete.
Finally, we discuss the concept of cardinality, which is a measure of the "size" of a set.
Infinity has always been a tricky concept in mathematics.
The ancient Greeks were aware of the concept of infinity, but they did not have a rigorous way to deal with it.
The modern treatment of infinity began with Georg Cantor in the late 19th century, who developed set theory and introduced the concept of cardinality.
With a formal definition of cardinality and related concepts, we resolve paradoxes such as Zeno's paradox and Hilbert's paradox of the Grand Hotel.
For finite sets, the cardinality is simply the number of elements in the set.
For infinite sets, we need a more sophisticated approach.
A set $A$ is said to be countably infinite if there exists a bijective map $f: A \to \mathbb{N}$, where $\mathbb{N}$ is the set of natural numbers.
In other words, a set is countably infinite if its elements can be put into a one-to-one correspondence with the natural numbers.
Examples of countably infinite sets include the set of integers $\mathbb{Z}$ and the set of rational numbers $\mathbb{Q}$.
A set $A$ is said to be uncountably infinite if it is infinite, but there does not exist a bijective map $f: A \to \mathbb{N}$.
In other words, a set is uncountably infinite if its elements cannot be put into a one-to-one correspondence with the natural numbers.
An example of an uncountably infinite set is the set of real numbers $\mathbb{R}$.
We can make a visual demonstration to show that $\mathbb{Q}$ has a bijection with $\mathbb{N}$. Let's create an infinite matrix, where the rows and columns are indexed by natural numbers, and the entry at row $p$ and column $q$ is the rational number $p/q$.
Notice that if we traverse the matrix in a diagonal manner, we can list all the rational numbers.
We can start from the top-left corner and move diagonally down to the right, then move to the next diagonal and repeat the process.
This way, we can list all the positive rational numbers in a sequence, skipping duplicates such as $2/2$ or $2/4$ that represent fractions already listed; interleaving $0$ and the negative rationals into the same list then shows that $\mathbb{Q}$ is countably infinite.
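The diagonal traversal is easy to turn into code. The following Python sketch (an illustration, not from the book) enumerates the positive rationals by walking the anti-diagonals of the $p/q$ matrix and skipping duplicates such as $2/2$, so each positive rational appears exactly once.

```python
from fractions import Fraction

def positive_rationals():
    """Enumerate the positive rationals by walking the anti-diagonals of the p/q grid."""
    seen = set()
    d = 2
    while True:                      # d = p + q indexes the anti-diagonal
        for p in range(1, d):
            q = d - p
            r = Fraction(p, q)
            if r not in seen:        # skip duplicates like 2/2, 2/4, ...
                seen.add(r)
                yield r
        d += 1

gen = positive_rationals()
print([str(next(gen)) for _ in range(10)])
# ['1', '1/2', '2', '1/3', '3', '1/4', '2/3', '3/2', '4', '1/5']
```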